Implement basic BPE tokenizer #1

Open
HYW1083 wants to merge 1 commit into main from codex/create-document-for-chapter-two-bpe

HYW1083 (Owner) commented on Aug 18, 2025

Summary

  • add Byte Pair Encoding training and tokenizer implementation
  • expose functions through test adapters
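
For readers unfamiliar with the technique named in the summary, here is a minimal sketch of a byte-level BPE training loop. It is not the code in this PR: the function name `train_bpe_sketch`, the absence of pre-tokenization and special tokens, and the tie-breaking rule (preferring the lexicographically larger pair) are all assumptions made for illustration.

```python
from collections import Counter


def train_bpe_sketch(text: str, num_merges: int):
    """Toy byte-level BPE trainer (hypothetical helper, not this PR's API).

    Returns (vocab, merges), where vocab maps token id -> bytes and
    merges lists the merged byte pairs in the order they were learned.
    """
    # Start from the 256 single-byte tokens.
    vocab = {i: bytes([i]) for i in range(256)}
    merges = []
    # Treat the whole corpus as one id sequence (assumption: no pre-tokenization).
    ids = list(text.encode("utf-8"))

    for _ in range(num_merges):
        # Count every adjacent pair of token ids.
        pair_counts = Counter(zip(ids, ids[1:]))
        if not pair_counts:
            break
        # Pick the most frequent pair; break ties by the larger byte strings
        # (an assumed rule, chosen only to make the result deterministic).
        best = max(
            pair_counts,
            key=lambda p: (pair_counts[p], vocab[p[0]], vocab[p[1]]),
        )
        new_id = len(vocab)
        vocab[new_id] = vocab[best[0]] + vocab[best[1]]
        merges.append((vocab[best[0]], vocab[best[1]]))

        # Replace each non-overlapping occurrence of the pair, left to right.
        merged, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == best:
                merged.append(new_id)
                i += 2
            else:
                merged.append(ids[i])
                i += 1
        ids = merged

    return vocab, merges


vocab, merges = train_bpe_sketch("aaabdaaabac", num_merges=3)
print(merges)
```

On the toy string above, the three learned merges build up the tokens `aa`, `aaa`, and `aaab` in turn. A real trainer would typically pre-tokenize the corpus and update pair counts incrementally rather than recounting on every merge.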

Testing

  • pytest tests/test_train_bpe.py -q (fails: ModuleNotFoundError: No module named 'numpy')

https://chatgpt.com/codex/tasks/task_e_68a35d17c5a88327805bca1a388ac3a1

HYW1083 closed this on Aug 18, 2025
HYW1083 reopened this on Aug 18, 2025
